[Feature] Reconsider prompts for GRPO #3030

Open. Wants to merge 28 commits into base: main

Conversation

@vmoens (Collaborator) commented Jun 30, 2025

No description provided.

pytorch-bot bot commented Jun 30, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3030

Note: Links to docs will display an error until the docs builds have been completed.

❌ 11 New Failures, 3 Pending, 3 Unrelated Failures

As of commit cc647b2 with merge base 2c45bde:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jun 30, 2025
@vmoens vmoens changed the title "[Feature] Reconsider prompts for GRPO" [Feature] Reconsider prompts for GRPO Jul 1, 2025
@vmoens vmoens added the enhancement label Jul 2, 2025
@vmoens vmoens force-pushed the grpo-thinking branch 2 times, most recently from 39772aa to 8151cf4 Compare July 7, 2025 15:48
vmoens added 3 commits July 7, 2025 17:27
[Feature] Add thinking prompts to GRPO

amend

Labels: CLA Signed, enhancement
2 participants